Exploiting Partial Information in Taxonomy Construction

Authors

Rob Shearer

Ian Horrocks

Boris Motik

Abstract

One of the core services provided by description logic (DL) reasoners is classification: determining the subsumption quasi-ordering over the concept names occurring in a knowledge base (KB) and caching this information in the form of a directed acyclic graph known as the concept hierarchy or taxonomy. For less expressive DLs, such as members of the EL family, it may be possible to derive all the relevant subsumption relationships in a single computation [Baader et al., 2005]. In general, however, it will be necessary to “deduce” the subsumption relation by performing individual subsumption tests between pairs of concept names. For n concept names this will, in the worst case, require n² tests, but for the tree-shaped hierarchies typically found in realistic KBs much better results can be achieved using algorithms that construct the taxonomy incrementally, traversing the partially constructed taxonomy to find the right place to insert each concept name.

This kind of algorithm suffers from two main difficulties. First, individual subsumption tests can be computationally expensive: for some complex KBs, even state-of-the-art reasoners may take a long time to perform a single test. Second, even when subsumption tests themselves are very fast, a knowledge base containing a very large number of concepts will result in a very large taxonomy, and repeatedly traversing this structure can be costly. The first difficulty is usually addressed by using an optimized construction that tries to minimize the number of subsumption tests performed in order to deduce the subsumption relation. Most implemented systems use an “enhanced traversal” algorithm due to Ellis [1991] and to Baader et al. [1994], which adds concepts to the taxonomy one at a time using a two-phase top-down and bottom-up breadth-first search of the partially constructed taxonomy. The algorithm exploits the structure of the KB to identify “obvious” subsumers (so-called told-subsumers) of each concept, and uses this information in a heuristic that chooses the order in which concepts are added, the goal being to construct the taxonomy top-down; it also exploits information from the top-down search in order to prune the bottom-up search. The second difficulty can be addressed by optimizations that try to identify a subset of the concepts for which complete information about the subsumption relation can be
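
To make the contrast between pairwise testing and incremental insertion concrete, the following is a minimal Python sketch of only the top-down phase of a traversal-style insertion. It is not the enhanced traversal algorithm of Ellis or Baader et al.: it omits the bottom-up phase and the told-subsumer heuristics, and it assumes concepts are inserted in top-down order with no equivalent concepts. The names `Taxonomy`, `top_search`, `build_example`, and the `subsumes` oracle are hypothetical and stand in for a real reasoner.

```python
from collections import deque


class Taxonomy:
    """Toy taxonomy built by top-down insertion (illustrative only)."""

    TOP = "TOP"

    def __init__(self, subsumes):
        # subsumes(a, b) is a hypothetical oracle standing in for a reasoner's
        # subsumption test: True when concept a subsumes concept b.
        self.subsumes = subsumes
        self.children = {self.TOP: set()}
        self.parents = {self.TOP: set()}
        self.tests = 0  # number of oracle calls made so far

    def top_search(self, concept):
        """Return the most specific existing nodes that subsume concept."""
        cache = {}  # each node is tested at most once per search

        def is_subsumer(node):
            if node not in cache:
                self.tests += 1
                cache[node] = self.subsumes(node, concept)
            return cache[node]

        found, visited, queue = set(), {self.TOP}, deque([self.TOP])
        while queue:  # breadth-first descent from the top node
            node = queue.popleft()
            below = [c for c in self.children[node] if is_subsumer(c)]
            if not below:
                found.add(node)  # no child subsumes concept: node is a parent
            for child in below:
                if child not in visited:
                    visited.add(child)
                    queue.append(child)
        return found

    def insert(self, concept):
        # Assumes every subsumer of concept is already in the taxonomy, so the
        # new concept becomes a leaf and no bottom-up search is needed.
        parents = self.top_search(concept)
        self.children[concept] = set()
        self.parents[concept] = parents
        for parent in parents:
            self.children[parent].add(concept)


def build_example():
    # Made-up toy hierarchy: each concept maps to the set of all its ancestors.
    ancestors = {
        "Animal": set(),
        "Mammal": {"Animal"},
        "Bird": {"Animal"},
        "Dog": {"Mammal", "Animal"},
        "Cat": {"Mammal", "Animal"},
        "Sparrow": {"Bird", "Animal"},
    }
    tax = Taxonomy(lambda a, b: a == b or a in ancestors[b])
    for name in ["Animal", "Mammal", "Bird", "Dog", "Cat", "Sparrow"]:
        tax.insert(name)
    return tax


tax = build_example()
print("tests used:", tax.tests, "of a worst case of", 6 * 6)
print("parents of Dog:", tax.parents["Dog"])
```

On this toy tree the search only tests the children of nodes already known to subsume the new concept, which is why the test count stays well below the pairwise worst case. A real implementation would also need the bottom-up phase to handle concepts inserted above existing ones.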


Similar resources

Taxonomy Construction Techniques – Issues and Challenges

A taxonomy is essential for organizing any body of information. It plays an important role in information and content management, and it also helps in searching for content. The most common method of constructing a taxonomy has been manual construction. Because the amount of information available today is huge, constructing a taxonomy for it manually was time consuming and maintenance was difficul...


Hierarchical Taxonomy Extraction by Mining Topical Query Sessions

Search engine logs store detailed information on Web users' interactions. Thus, as more and more people use search engines on a daily basis, important trails of users' common knowledge are being recorded in those files. Previous research has shown that it is possible to extract concept taxonomies from full text documents, while other scholars have proposed methods to obtain similar queries from q...


Using Taxonomic Background Knowledge in Propositionalization and Rule Learning

Knowledge representations using semantic web technologies often provide information which translates to explicit term and predicate taxonomies in relational learning. Here we show how to speed up the process of propositionalization of relational data by orders of magnitude, by exploiting such ontologies through a novel refinement operator used in the construction of conjunctive relational featu...


Resolving Task Specification and Path Inconsistency in Taxonomy Construction

Taxonomies, such as the Library of Congress Subject Headings and the Open Directory Project, are widely used to support browsing-style information access in document collections. We call them browsing taxonomies. Most existing browsing taxonomies are manually constructed, so they cannot easily adapt to arbitrary document collections. In this paper, we investigate both automatic and interactive tech...


SIFT: An Algorithm for Extracting Structural Information From Taxonomies

In this work we present SIFT, a 3-step algorithm for the analysis of the structural information represented by means of a taxonomy. The major advantage of this algorithm is its ability to leverage the information inherent in the hierarchical structure of taxonomies to infer correspondences that allow them to be merged in a later step. This method is particularly relevant in scenarios where t...




Publication date: 2009
